Description

The present invention relates to DNA probes for detecting a tandemly-repeated nucleotide sequence in the gene encoding mucin glycoprotein expressed by human mammary epithelial cells, to the use of the probe in diagnosis and in "fingerprinting" individuals, to the polypeptides expressed by the corresponding mucin gene, to antibodies against the polypeptides and to the use of the polypeptides and antibodies in the diagnosis and therapeutic treatment of cancer.

Normal and malignant human mammary epithelial cells express high molecular weight glycoproteins (gps) which are extensively glycosylated and very antigenic. As a result, many of the monoclonal antibodies (MAbs) selected for reactivity with human breast cancer and other carcinomas are found to react with molecules which are produced in abundance by the fully differentiated human mammary tissue and are found in the milk fat globule (MFG) and in milk. However, the level of expression of a particular antigenic determinant may be different in the gps produced by the normal differentiated cell and in the similar molecules produced by breast cancers. This means that some antibodies can show a certain specificity for reacting with tumour gps.

The molecules bearing the epitopes recognised by these antibodies are complex and have been difficult to analyse, both because they are large and heavily glycosylated (>250,000 daltons) and because of the complex pattern of expression. Two of the MAbs, HMFG-1 and -2, react with a component in human milk which appears to be greater than 400,000 daltons, whereas the molecules found in sera and tumours are smaller, although the dominant components are still greater than 200,000 daltons on immunoblots. The large glycoprotein produced by the differentiated mammary epithelial cells found in human milk or in the milk fat globule has been purified and shown to have some of the characteristics of the mucins. This component contains a large amount of carbohydrate joined in O-linkage to serine and threonine residues via the linkage sugar N-acetylgalactosamine. Moreover, the core protein contains high levels of serine, threonine and proline and low levels of aromatic and sulphur-containing amino acids.

These mucin-like glycoproteins are also secreted by a number of other normal epithelial cells. The monoclonal antibody HMFG-1 is highly reactive with the milk mucin and evidence suggests that the epitope recognised by this antibody is more abundant on the fully processed mucin, characteristic of normal differentiation.

In tumours, the molecular weight of the molecules carrying these antigenic determinants differs among individual tumours and, in the case of the components recognised by the HMFG-2 antibody, can range from 80-400K daltons. Although it appears that the differences observed in the mobility of the high molecular weight bands are due to genetic polymorphism, this probably does not explain variations in the size of the lower bands. It has been proposed that these may be the result of aberrant processing occurring in the tumour cell, possibly within the glycosylation pathways.

For the majority of the monoclonal antibodies reacting with this group of molecules the exact nature of the antigenic epitopes remains unclear, but circumstantial evidence has suggested that carbohydrate may at least be partly involved in many of the epitopes. Moreover, from previously available data it was not known whether the mucin found in the normal differentiated cells, and that observed in the tumours, contain the same core protein, or just carry common carbohydrate determinants.

Mucin has now been isolated from human milk by affinity chromatography, enabling identification of the core protein and the gene encoding the protein. This has been found to be a highly polymorphic gene defined by the peanut urinary mucin (PUM) locus [see Swallow et al., Disease Markers, 4, 247 (1986) and Nature, 321, 82-84 (1987)]. The gene product, which is hereafter referred to as human polymorphic epithelial mucin or HPEM, has been detected in breast tumours and other carcinomas as well as in some normal epithelial tissues.

It has now been found that the HPEM core protein has epitopes which also appear in the aberrantly processed gps 
produced by adenocarcinoma cells. Certain of these epitopes are not exposed in the fully processed mucin glycoprotein 
produced by the lactating mammary gland. 

In one aspect the present invention therefore provides an antibody against a human mucin core protein which antibody substantially does not react with a fully processed human mucin glycoprotein.

As used herein the term "antibody" is intended to include fragments of antibodies bearing antigen binding sites such as the F(ab')2 fragments.

Antibodies according to the present invention react with HPEM core protein, especially as expressed by colon, lung, ovary and particularly breast carcinomas, but have reduced or no reaction with the corresponding fully processed HPEM. In a particular aspect the antibodies react with HPEM core protein but not with fully processed HPEM glycoprotein as produced by the normal lactating human mammary gland.

Antibodies according to the present invention have no significant reaction with the mucin glycoproteins produced by pregnant or lactating mammary epithelial tissues but react with the mucin proteins expressed by mammary epithelial adenocarcinoma cells. These antibodies show a much reduced reaction with benign breast tumours and are therefore useful in the diagnosis and localisation of breast cancer as well as in therapeutic methods.

The antibodies may be used for other purposes including screening cell cultures for the polypeptide expression product of the human mammary epithelial mucin gene, or fragments thereof, particularly the nascent expression product. In this case the antibodies may conveniently be polyclonal or monoclonal antibodies.

Antibodies according to the present invention may be produced by inoculation of suitable animals with HPEM core protein or a fragment thereof such as the peptides described below. Monoclonal antibodies are produced by the method of Kohler & Milstein (Nature 256, 495-497 (1975)) by immortalising spleen cells from an animal inoculated with the mucin core protein or a fragment thereof, usually by fusion with an immortal cell line (preferably a myeloma cell line), of the same or a different species as the inoculated animal, followed by the appropriate cloning and screening steps.

In a particular aspect the present invention provides the monoclonal antibodies designated SM3 against the HPEM core protein. In another aspect the invention provides the hybridoma cell line which secretes the antibodies SM3 and has been designated HSM3. Samples of HSM3 have been deposited with ECACC on 7th January 1987 under accession number 87010701.

Using antibodies according to the invention it has been possible to screen a phage library constructed from mRNA isolated from a human breast cancer cell line to identify sequences coding for portions of the mucin core protein. Complementary DNA sequences have been constructed and from these it has surprisingly been found that the gene encoding the core protein contains multiple tandem repeats of a 60 base sequence, leading to considerable polymorphism, sufficiently extensive that cDNA fragments corresponding to the repeat sequence would be useful for fingerprinting DNA. The fingerprinting thus made possible has applications in, for instance, ascertaining whether bone marrow growth after transplants is from the host or the donor, and in forensic medicine for identifying individuals using body tissues or fluids.
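The principle behind such fingerprinting can be illustrated with a minimal sketch. It assumes only what is stated above (a 60 base repeat unit whose copy number varies between alleles); the flanking length and the repeat counts are hypothetical values chosen for illustration and are not data from this specification.

```python
REPEAT_LENGTH = 60  # bases per tandem repeat, as stated in the text


def fragment_length(repeat_count: int, flank_bases: int) -> int:
    """Length in bases of a restriction fragment spanning the repeat array.

    flank_bases is the non-repeat DNA lying between the two restriction
    sites (a hypothetical value, not taken from the specification).
    """
    if repeat_count < 0 or flank_bases < 0:
        raise ValueError("repeat_count and flank_bases must be non-negative")
    return flank_bases + repeat_count * REPEAT_LENGTH


def band_pattern(allele_repeat_counts, flank_bases):
    """Sorted fragment lengths for the alleles carried by one individual."""
    return sorted(fragment_length(n, flank_bases) for n in allele_repeat_counts)


if __name__ == "__main__":
    # Hypothetical host and donor, each carrying two alleles.
    host = band_pattern((30, 45), flank_bases=500)
    donor = band_pattern((38, 62), flank_bases=500)
    print("host bands (bases): ", host)
    print("donor bands (bases):", donor)
    print("distinguishable:", host != donor)
```

Each additional repeat lengthens the fragment by 60 bases, so individuals carrying different repeat numbers give different band patterns on a blot probed with the repeat sequence.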

Accordingly the present invention also provides a nucleic acid fragment comprising at least 17 nucleotide bases, the fragment being hybridisable with at least one of

[...]

c) RNA having a sequence corresponding to the DNA sequence of a) and

d) RNA having a sequence corresponding to the complementary DNA sequence of b).

The sequences in (a) and (b) each include a double tandem repeat sequence of 120 bases. Fragments according to the invention may correspond to any portion of this sequence including portions bridging the start point of the repeat.

Fragments according to the invention will hybridise under conditions of low stringency with the DNA and RNA sequences (a) to (d) above. Preferred fragments are those which also hybridise under conditions of high stringency. The most preferred fragments of the invention are those which have sequences exactly identical to, or exactly complementary to, the sequences (a) to (d) above.

Normally a given DNA or RNA fragment according to the invention will be capable of hybridising with both DNA according to a) and RNA according to c) or with both DNA according to b) and RNA according to d) above.

Preferably the nucleic acid fragment according to the present invention will comprise a portion of at least 30 nucleotide bases capable of hybridising with at least one of a) to d) above, more preferably at [...]; most preferably the fragment contains a sequence of 60 bases exactly [...]

[...] fragments have sequences corresponding to at least a portion of the sequences

a') GTG GGC TGG GGG GGC GGT GGA GCC
a'') CGG GGC CGG CCT GGT GTC CGG GGC CGA GGT GAC AC

b') DNA complementary to the sequence of a') or a'').
c') RNA having a sequence corresponding to the sequence of a') or a'') and
d') RNA having a sequence corresponding to one of the complementary DNA sequences of b').
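How the two DNA fragments listed above relate to the 20 amino acid repeat given later in the description can be checked with a short sketch. It assumes the fragments are transcribed correctly from the listing above; the function names are illustrative, and the codon table is the standard genetic code. Under those assumptions, the reverse complement of each fragment, read in the frame indicated, encodes a stretch of the repeat peptide.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str, frame: int = 0) -> str:
    """Translate complete codons of dna starting at the given offset."""
    coding = dna[frame:]
    usable = len(coding) - len(coding) % 3
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


# Fragments as listed in the text (spaces removed).
FRAGMENT_1 = "GTG GGC TGG GGG GGC GGT GGA GCC".replace(" ", "")
FRAGMENT_2 = "CGG GGC CGG CCT GGT GTC CGG GGC CGA GGT GAC AC".replace(" ", "")

# One 20-residue repeat unit from the peptide sequence in the description,
# written in one-letter code.
REPEAT_PEPTIDE = "VTSAPDTRPAPGSTAPPAHG"

if __name__ == "__main__":
    pep_1 = translate(reverse_complement(FRAGMENT_1))
    pep_2 = translate(reverse_complement(FRAGMENT_2), frame=2)
    print(pep_1, pep_1 in REPEAT_PEPTIDE)
    print(pep_2, pep_2 in REPEAT_PEPTIDE)
```

The two fragments total 59 bases, one short of the 60 base repeat, which is consistent with their being adjacent portions of a single repeat unit.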

In the human genome the DNA tandem repeat sequence comprises antiparallel double stranded DNA, one strand [...]

[...] nucleic acid fragments of the present invention may also be used in active immunisation techniques. In such [...] expressed as a [...] product by conventional techniques. In one aspect the polypeptide product may be [...] then expressed as a polypeptide [...] harvested and used as an immunogen to induce [...]
Val Thr Ser Ala Pro Asp Thr Arg Pro Ala Pro Gly Ser Thr
Ala Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr Arg
Pro Ala Pro Gly Ser Thr Ala Pro Pro Ala His Gly
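That the 40 residue sequence above consists of two copies of a 20 amino acid unit can be confirmed with a small sketch. The helper names are illustrative; the sequence is copied from the listing above.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asp": "D", "Gly": "G", "His": "H",
    "Pro": "P", "Ser": "S", "Thr": "T", "Val": "V",
}

SEQUENCE = """
Val Thr Ser Ala Pro Asp Thr Arg Pro Ala Pro Gly Ser Thr
Ala Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr Arg
Pro Ala Pro Gly Ser Thr Ala Pro Pro Ala His Gly
"""


def to_one_letter(text: str) -> str:
    """Convert whitespace-separated three-letter residues to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in text.split())


def smallest_period(seq: str) -> int:
    """Length of the shortest unit whose exact repetition gives seq."""
    for period in range(1, len(seq) + 1):
        if len(seq) % period == 0 and seq == seq[:period] * (len(seq) // period):
            return period
    return len(seq)


if __name__ == "__main__":
    peptide = to_one_letter(SEQUENCE)
    period = smallest_period(peptide)
    print(len(peptide), "residues; repeat unit of", period, "residues")
    print("repeat unit:", peptide[:period])
```

A 20 residue unit corresponds to the 60 base DNA repeat at three bases per codon.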

[...] sequence above and may include the start point of the repeat sequence.

Other polypeptides according to the invention include three or more repeats of the 20 amino acid repeat sequence [...] acid residues.




Such polypeptides may [...]

The invention further provides polypeptides [...] oligosaccharide moieties [...] or via other linkages [...] amino acid residues, preferably conforming to the amino acid [...] by the PUM gene and [...]

In a further aspect the present invention [...] in non-human cells, [...] may be produced by recombinant DNA [...] mucin glycoprotein which itself may be [...] Alternatively it may be obtained by [...] full processing in a human cell [...]

[...] tests and tumour localisation and in [...] fragments thereof, against any of the [...] diagnostically effective ligands. [...] agents include toxins, radioisotopes [...] patient) or in order to vary the isotype of the [...] supports and detectable labels such as [...]

In the diagnostic field the antibodies may be linked to ligands [...] directly or indirectly detectable [...] enzyme labels, chromophores and fluorophores as well as [...] Preferably monoclonal antibodies or fragments [...] a sample suspected to contain [...]

The invention further provides a diagnostic [...] tumour localisation [...] assays for detecting and/or assessing the severity of breast, ovarian [...] are provided for use in diagnostic assays and comprise antibody, or fragment thereof, optionally [...]

[...]ing drawings in which

Figure Legends 

[...] several individuals were combined and [...] (track 1). The [...] HMFG-2 [...] ST254 (track 3) and RPMI + 20% FCS (track 4).

[...] Bolton and Hunter method and subjected to SDS polyacrylamide electrophoresis and autoradiography.

Figure 3 : Autoradiography of the iodinated milk mucin after treatment with hydrogen fluoride. The purified milk mucin was treated with HF for 3 hours at room temperature (track 1) or 1 hour at 4°C (track 2) and the resulting preparations were then iodinated and run on SDS polyacrylamide gels.

Figure 4 : Reactivity of the intact, partially stripped or extensively stripped milk mucin with iodinated lectins. The purified intact milk mucin (track 1), the mucin treated with HF for 1 hour at 4°C (track 2) and the mucin treated for 3 hours at room temperature (track 3) were subjected to SDS polyacrylamide electrophoresis and then transferred to nitrocellulose paper. The paper was then probed with 125I PNA (peanut agglutinin), 125I WGA (wheat germ agglutinin), or 125I HPA (Helix pomatia agglutinin).

Figure 5 : Immunoprecipitation and immunoblot of the partially and extensively stripped mucin. A, the 125I extensively stripped mucin was immunoprecipitated with SM-3 (track 3), HMFG-2 (track 2) or NS2 medium as a control (track 1) by the protein A plate method (see Materials and Methods). B, the partially stripped mucin (track 1) or extensively stripped mucin (track 2) was run on SDS polyacrylamide gels and transferred to nitrocellulose paper. The blot was then reacted with a cocktail of SM-3 and SM-4 monoclonal antibodies and the binding detected using an ELISA method.

Figure 6 : Reactivity of monoclonal antibodies SM-3 and HMFG-2 with methacarn fixed breast tissue and tumour sections using an indirect immunoperoxidase staining method. Infiltrating ductal carcinoma showing strong reactivity with both SM-3 (A) and HMFG-2 (B). Fibroadenoma showing no reactivity with SM-3 (C) and strong heterogeneous staining of the epithelium with HMFG-2 (D). Papilloma showing very weak reactivity with SM-3 (E) and strong positivity with HMFG-2 (F). Both normal resting breast (G) and lactating breast (I) were negative when stained with SM-3, whereas both tissues stained positively with HMFG-2, with lactating breast (J) much stronger than normal resting breast (H).


Figure 7. Periodic acid-silver stained milk mucin after antibody affinity column and gel filtration column. Milk mucin was purified on an HMFG-1 antibody affinity column (lane 1) followed by passage through a G75 Sephadex column (lane 2), subjected to NaDodSO4/polyacrylamide gel electrophoresis, and silver stained following treatment of gels with 0.2% periodic acid.

Figure 8. Silver stain of partially and totally stripped core protein from milk mucin. The purified milk mucin was deglycosylated by treatment with anhydrous hydrogen fluoride for 1 hr at 0°C (lane 1) and 3 hr at room temperature (lane 2), separated by electrophoresis through a NaDodSO4/polyacrylamide gel (10%) and silver stained.

Figure 9. Immunoprecipitation with MAbs of in vitro translated protein products from MCF-7 poly(A) RNA. Poly(A) RNA from MCF-7 cells was translated in a rabbit reticulocyte lysate system (Amersham) in the presence of [35S]methionine (1000 Ci/mmole; 1 Ci = 37 GBq) following the manufacturer's [...] 10^4 acid precipitable cpm were precipitated with MAbs SM-4 (lane a), SM-3 (lane b), HMFG-2 (lane c), HMFG-1 (lane d) and an irrelevant MAb to interferon (lane e, 24), separated on a NaDodSO4/polyacrylamide gel (10%), impregnated with Amplify and exposed to XAR-5 film at -70°C for 20 days.

Figure 10. Immunoblot analysis of fusion proteins from the λMUC clones. The phage clones λMUC 3, 4, 6, 7, 8, 9 and 10 were used to lysogenize bacterial strain Y1089. Lysogens were grown at 32°C, shifted to 42°C, and then induced with IPTG. Lysogen proteins were fractionated by electrophoresis through a NaDodSO4/polyacrylamide gel (7.5%), transferred to nitrocellulose, and reacted with HMFG-2. The binding was detected with an ELISA method using 4-chloro-1-naphthol as the substrate. The numbers are those of the λ clones.

Figure 11. Hybridization of pMUC10 to cDNA inserts of pMUC clones. DNA from the plasmid clones was digested with restriction enzyme EcoRI to excise the cDNA inserts, separated by electrophoresis on 1.4% agarose and transferred to Biodyne nylon membrane. The filter was hybridized using standard conditions (34) to the insert from pMUC10 which was labelled with [α-32P]dCTP by the method of random priming (41). Lanes: plasmid clones 3, 4, 6, 7, 8, 9, 10.

Figure 12. RNA blot hybridization analysis of mammary breast mucin mRNA. 10 µg of total RNA from human breast cancer cells MCF-7 (lane 1) and T47D (lane 2), normal human mammary epithelial cells HuME (lane 3), human embryonic fibroblasts ICRF 23 (lane 4), Daudi cells (lane 5) and carcinosarcoma HS578T cells (lane 6) were separated on a 1.3% agarose/glyoxal gel, blotted on to nitrocellulose and hybridized to the pMUC10 EcoRI insert which was labelled with [α-32P]dCTP by the method of random priming (41). The size markers were 28S (5.4 kb) and 18S (2.1 kb) rRNAs.

Figure 13. Polymorphic human DNA fragments detected by hybridization with pMUC10 probe. Genomic DNA samples prepared from the white blood cells from ten individuals (six unrelated) and from three cell lines were digested to completion with HinfI and EcoRI, fractionated by electrophoresis through 0.7% and 0.6% agarose, respectively, and transferred to Biodyne nylon membranes. The filter was hybridized to the pMUC10 DNA insert which was labelled with [α-32P]dCTP by the method of random priming (41). X-ray film was exposed for 1 day at -70°C with intensifying screens. Lanes 1-4: father, two daughters and mother; lanes 5-10: unrelated individuals; lane 11 is MCF-7, lane 12 is ZR75-1, lane 13 is ICRF-23. The DNA samples exhibit a wide distribution of sizes. Numbers indicate length of DNA in kb. The apparent bands at 23 kb in lanes 12 and 13 are artefacts introduced in autoradiography.

Example 1 

Purification of the milk mucin 

The milk mucin was purified from human skimmed milk by passage through an HMFG-1 affinity column followed by size exclusion chromatography. The HMFG-1 monoclonal antibody was purified from tissue culture supernatant using a protein A column (1). The purified antibody was coupled to cyanogen bromide activated Sepharose (Pharmacia) as described in the manufacturer's instructions. Human skimmed milk was passed in batches of 100 ml through the antibody column followed by extensive washing with PBS. Bound antigen was eluted from the column using 0.1 M glycine pH 2.5 and the fractions registering an optical density at 280 nm were pooled, dialysed against 0.25 M acetic acid and lyophilized. Batches of about 20 mg were dissolved in 0.25 M acetic acid and passed through a G75 Sephadex column (1 x 100 cm) which had been previously equilibrated with acetic acid. The column was washed with 0.25 M acetic acid and 1 ml fractions collected. The peak fractions which were eluted in the void volume were pooled, lyophilized and the dry powder stored at 4°C. Amino acid analysis was performed using a Beckman 6300 amino acid analyser.

Deglycosylation of the milk mucin

To remove the O-linked carbohydrate from the milk mucin the molecule was treated with anhydrous hydrogen fluoride as described by Mort and Lamport (21), for either 1 hour at 4°C, which produced a partially stripped preparation, or 3 hours at room temperature, which produced the extensively stripped mucin.



Iodination of the milk mucin

Iodinations of the purified mucin and of the partially or extensively stripped mucin were carried out using the Bolton and Hunter method (51). Briefly, the mucin, 2.5 µg in 20 µl 0.1 M borate buffer pH 8.5, was added to the dried Bolton and Hunter reagent (1 mCi, Amersham International plc) and incubated at room temperature for 15 minutes. The reaction was stopped by the addition of 0.5 ml of 0.2 M glycine in borate buffer and, after a further 15 minutes incubation, free Bolton and Hunter reagent was removed by passage through a G25 Sephadex column (PD10 columns, Pharmacia) previously equilibrated in PBS.



Iodination of lectins



Wheat germ agglutinin (WGA), peanut agglutinin (PNA) (Vector Labs) and Helix pomatia agglutinin (HPA) (Boehringer) were iodinated as described by Karlsson et al. (52) using the chloramine T method.

Polyacrylamide gels and Western blots

Polyacrylamide gel electrophoresis and immunoblotting were performed as described previously (1). Briefly, samples were run on 5-15% polyacrylamide gels and then electrophoretically transferred to nitrocellulose paper (Schleicher and Schuell) at 50 volts overnight at 4°C (36). In the immunoblotting experiments the paper was reacted with monoclonal antibodies and binding detected with an ELISA method using 4-chloro-1-naphthol as the substrate. For lectin binding studies the Western blots were reacted with the iodinated lectins as described by Swallow et al. (48).

Production of monoclonal antibodies

A female BALB/c mouse was immunized with 5 µg of the partially stripped milk mucin in Freund's complete adjuvant and 3 months later boosted with a further 5 µg of the same preparation in Freund's incomplete adjuvant. After a further 20 days, 5 µg of the mucin extensively stripped of its carbohydrate was given intravenously in saline solution. The spleen was removed 4 days later, and fused with the NS2 mouse myeloma cell line (53).

Screening of hybridoma supernatants and immunoprecipitations

The screening assay was a modification of that described by Melero and Gonzalez-Rodriguez (54). Multiwell plates were coated with 50 µl of 0.1 mg/ml protein A (Pharmacia Fine Chemicals) in PBS and allowed to dry overnight at 37°C. The plates were blocked with 5% BSA for 1 hour at 37°C followed by the addition of 50 µl of rabbit anti-mouse immunoglobulin (DAKO, diluted 1:10 in PBS/BSA). After incubating for 2 hours at 37°C the plates were washed twice with PBS containing 1% BSA and 50 µl of hybridoma supernatant added. The plates were incubated overnight at 4°C, washed twice with PBS/BSA and 50 µl of iodinated partially stripped mucin containing 100,000 cpm added to each well. The plates were then incubated at room temperature for 4 hours, washed 4 times with PBS/BSA and the individual wells counted in a gamma counter. For immunoprecipitation experiments 50 µl of SDS sample buffer containing dithiothreitol was added to each of the wells, which were then boiled for 3 minutes, and the buffer loaded onto 5-15% polyacrylamide gradient gels.

Staining of tissue sections

Tissues from primary mammary carcinomas, benign breast biopsies, normal breast, and pregnant lactating breast tissue were fixed in methacarn (methanol, chloroform and acetic acid 60:30:10) and embedded into paraffin wax. Sections were stained with the antibodies using the indirect peroxidase anti-peroxidase method as previously described (47).



Results 

Purification of the milk mucin

The milk mucin was purified from human skimmed milk on an HMFG-1 antibody affinity column. Iodination of the eluted material revealed the presence of a large molecular weight component and a 68KD band. Precipitation of the affinity purified material with antibodies HMFG-1 and HMFG-2 (tracks 2 and 5) followed by gel electrophoresis showed that both the high molecular weight components and the 68KD component were precipitated by both antibodies (less effectively by HMFG-2). Since the 68KD component was also precipitated by two unrelated antibodies (figure 1, tracks 3 and 4) and this component was not evident on an immunoblot of the purified material reacted with HMFG-1 (figure 2A), the 68K component was removed by molecular sieve chromatography on a G75 column. The final purified product showed a major high molecular weight band, with only a trace of the 68K component and a minor contaminant around 14K (figure 2B).

A high molecular weight glycoprotein (PAS-O) containing more than 50% carbohydrate in O-linkage has been purified from the human milk fat globule by Shimizu and Yamauchi (8). To see whether this component was similar to the mucin isolated from milk by affinity chromatography on an HMFG-1 affinity column, the amino acid composition of the purified HMFG-1 reactive mucin was determined and compared to the amino acid composition of the purified PAS-O component. Table 1 shows that there is good correspondence between the two sets of data, indicating that the core proteins of PAS-O and the mucin purified here are the same.

Isolation of the core protein of the milk mucin

As there are no enzymes easily available that are efficient at removing O-linked sugars, and β-elimination often results in damage to the protein core, the oligosaccharides were removed by treatment of the mucin with anhydrous hydrogen fluoride. This treatment has been shown by Mort and Lamport (21) to be effective in removing sugars from pig submaxillary mucin without damaging the protein core. Amino acid analysis of the material produced after HF treatment of the milk mucin suggested that the protein core was also in this case undamaged, since the composition was the same as that seen in the intact mucin (Table 1).

Initially the milk mucin was exposed to HF for only 1 hour at 4°C, but analysis of the product showed that there was only partial removal of the sugars with such treatment, and it was necessary to treat the mucin at room temperature for 3 hours to obtain a molecule which showed no lectin binding ability. Figure 3 shows an autoradiograph of the iodinated mucin after treatment with HF for 1 hour at 4°C (track 2) or 3 hours at RT (track 1). It can be seen from Figure 3 that the milder treatment results in a mixture of products made up of high molecular weight material, which is slightly smaller than the intact mucin, and a number of smaller bands. After longer exposure to HF at room temperature, the high molecular weight bands disappeared, resulting in polypeptide bands of about 68KD and 72KD.

To test for the presence of sugars on the intact mucin and on the products produced after the two different HF treatments, each preparation was subjected to acrylamide gel electrophoresis, transferred to nitrocellulose paper and probed with [...] labelled lectins. The lectins used were peanut lectin (PNA), which reacts with galactose linked to N-acetylgalactosamine, wheat germ agglutinin (WGA), reactive with N-acetylglucosamine, and Helix pomatia agglutinin (HPA), which reacts with the linkage sugar in O-linked glycosylation, N-acetylgalactosamine. Figure 4 shows autoradiographs of the [...] blots and it can be seen that while treatment with HF for 1 hr at 4°C (track 2) alters the lectin reactivity of the mucin, carbohydrate is still present. Interestingly, however, there is a much lower level of binding of PNA to the high molecular weight material of the partially stripped mucin than is seen with the intact mucin (track 1). Moreover, this loss of PNA binding is accompanied by binding of the linkage sugar specific lectin HPA. This lectin shows no binding [...] and the changing pattern of lectin binding after limited treatment with HF indicates that sugars masking the O-linked N-acetylgalactosamine have been stripped off. The smaller component seen in both the intact mucin (track 1) and the partially stripped preparation (track 2) is a glycoprotein which reacts with WGA although not [...] may correspond to the component of similar molecular weight (around 68K) seen after affinity chromatography of the mucin and may represent an intermediate precursor molecule.

Figure 4 shows clearly that the 68K and 72K components produced after extensive treatment with HF (3 hr at RT) show no reactivity with the lectins (track 3), including the N-acetylgalactosamine specific lectin HPA. This observation is strong evidence that the sugars have been removed from at least the majority of the molecules, and we will refer to this preparation as the extensively stripped mucin.

Generation of monoclonal antibodies to the milk mucin core protein

A fusion was carried out using the spleen of a mouse that had been immunized with two injections of the partially stripped mucin followed by a boost with the extensively stripped mucin. The clones were initially screened against the 125I partially stripped material using protein A plates (see Methods). Four hybridomas were selected [...] and Table 2 shows their spectrum of reactivity with the intact, partially and extensively stripped mucin. [...] from this table three of the hybridomas which were isolated showed a strong reaction with the partially and extensively stripped mucin [...] with the intact mucin. These appeared to be good candidates for monoclonal antibodies to the protein core and two, SM-3 and SM-4, were selected to be characterised further.

It can also be seen from Table 2 that the HMFG-1 and HMFG-2 antibodies reacted very [...] stripped of its carbohydrate. These two antibodies were, in fact, developed using the intact mucin (from the milk fat globule) [...] growing mammary epithelial cells (14). [...] reaction [...] unexpected, as circumstantial evidence had previously led to the belief that carbohydrate might form at least part of their antigenic epitopes.

Molecular weight of molecules carrying antigenic determinants

We have previously shown that the molecular weight of the components in breast [...] determinants found on the milk mucin is lower than 400K and can vary from [...] SM-3 with Western blots of gel separated extracts of breast tumour cells shows that this antibody reacts with components of similar molecular weight to those reactive with antibody HMFG-2 [...] SM-3 differs from the antibodies HMFG-1 and 2 in that it does not react with the intact mucin processed by the lactating [...] gland [...] react with molecules processed by breast cancer cells, it was appropriate to examine the reactivity [...] SM-3 with a range of breast cancers.

Reactivity of SM-3 with breast tissues and tumours

The antibody SM-3 reacted with paraffin embedded tissues provided these were fixed [...] saline). Using this method for preparation of tissue sections, the reaction of the antibody was compared to that of HMFG-2 on breast tissues and tumours with an indirect immunoperoxidase staining method. This analysis showed a dramatic difference in the staining pattern of SM-3 compared to that seen with HMFG-2. Thus, although a strong positive reaction was seen in 20/22 breast cancers stained with SM-3 (as compared to 22/22 stained with HMFG-2), normal resting breast, pregnant or lactating tissues and most benign lesions were largely unstained with SM-3 but were stained with HMFG-2. Some examples of staining patterns of breast tissues and tumours are illustrated in Figure 6.

Twenty-two primary carcinomas and fourteen benign lesions were examined and the reaction of SM-3 compared to the staining with HMFG-2 in each case. In the primary carcinomas, staining with SM-3 was heterogeneous but generally quite strong and always confined to tumour cells; connective tissue and stroma showed no reaction (see figures 6A, B). In the four fibroadenomas examined, staining of the epithelium with HMFG-2 was strong although heterogeneous. In comparison, staining with SM-3 was negative in one case and in the three others staining was confined to only one or two glandular elements. HMFG-2 showed strong positivity on the five papillomas and five cases of cystic disease studied while the staining observed with SM-3 was very much weaker and more heterogeneous (figures 6G, H). The papillomas as a group showed the strongest staining with SM-3, and it can be seen that the staining was membranous or extracellular.

In contrast to HMFG-1 and HMFG-2 which strongly stain lactating and pregnant breast, SM-3 was totally negative 
with three out of six cases of pregnant or lactating breast (see figure 6C and D). Two positive cases showed only very 
weak staining of an occasional cell and in the third, staining was confined to two areas of one lobule. Again, in contrast 
to HMFG-1 and HMFG-2 which do react with some terminal ductal lobular units of normal, resting breast (albeit 
weakly), SM-3 was totally negative on eight out of the thirteen cases tested and in the other five cases staining was
extremely weak and often confined to one or two acini in the tissue section (see figure 6E and F). It should perhaps be 
noted that the intensity of staining with HMFG-2 seen with normal breast tissues and benign lesions fixed in methacarn 
was somewhat higher than that reported previously using formalin fixed material (50,47). 

SM-3 was also shown to be negative on sections of normal liver, lung, thymus, sweat gland, epididymis, prostate,
bladder, small intestine, large intestine, appendix, thyroid and skin. The antibody showed weak positive staining only
with the distal tubules of the kidney, the occasional chief cell of the stomach, the occasional duct cell of the salivary
gland and the sebaceous gland.



Discussion 

Large molecular weight mucin molecules are expressed by many carcinomas and carry many of the tumour asso- 
ciated antigenic determinants recognized by monoclonal antibodies. These epitopes may also be expressed by some 
normal epithelium, and some monoclonal antibodies like HMFG-1 react particularly well with a mucin found in normal 
human milk (1, 17). As long as the study of the mucins is restricted to their detection with antibodies reactive with unde-
fined epitopes, the knowledge of their structure, expression and processing will also be restricted. We have begun to
investigate the structure and expression of the mammary mucin by isolating the core protein and developing antibodies
which have allowed us to select partial cDNA clones for the gene coding for the core protein. This Example describes
the production and characterization of these antibodies. 

Treatment of the HMFG-1 affinity purified milk mucin with hydrogen fluoride resulted in the appearance of a domi- 
nant band of about 68K daltons and a minor species of about 72KD on SDS acrylamide gels. These bands showed no 
reactivity with lectins, including Helix pomatia agglutinin which is specific for N-acetyl galactosamine, the first sugar in
O-linked glycosylation (55). It therefore seems probable that this 68K dalton polypeptide represents the core protein of
the mucin. Supportive evidence for this comes from the observation that the antibodies described here, which are reac-
tive with the stripped 68K component, can precipitate a molecule of this size from the in vitro translation products of
mRNA isolated from breast cancer cells expressing the mucin.

As the milk mucin contains at least 50% carbohydrate (16), a protein core of only 68KD appears too small if the
intact molecule has an observed molecular weight greater than 400KD. However, mucins can be composed of small
subunits which aggregate and are held together by some form of non-covalent interactions, as yet not understood. For
example, although the molecular weight of the ovine submaxillary mucin has been reported to be greater than 1 x 10^6
daltons (45), it has a protein core of only 650 amino acids with a molecular weight of 58,300 daltons (46).
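
The subunit argument can be checked with a short calculation. The sketch below is illustrative only: the 68K core, the 50% carbohydrate content and the 400K intact size are taken from the text, while the idea that carbohydrate is spread evenly over identical subunits is a simplifying assumption of ours and not a measured property of the mucin.

```python
# Figures taken from the text; the even-subunit model is an assumption.
CORE_KD = 68.0                 # stripped core protein
CARBOHYDRATE_FRACTION = 0.5    # "at least 50%" carbohydrate by weight
INTACT_KD = 400.0              # lower bound quoted for the intact milk mucin


def glycosylated_mass_kd(core_kd: float, carbohydrate_fraction: float) -> float:
    """Mass of one core protein plus its sugars, given the sugar weight fraction."""
    if not 0.0 <= carbohydrate_fraction < 1.0:
        raise ValueError("carbohydrate_fraction must be in [0, 1)")
    return core_kd / (1.0 - carbohydrate_fraction)


subunit_kd = glycosylated_mass_kd(CORE_KD, CARBOHYDRATE_FRACTION)
subunits_needed = INTACT_KD / subunit_kd

print(f"one glycosylated subunit: {subunit_kd:.0f} kd")
print(f"subunits needed to reach {INTACT_KD:.0f} kd: {subunits_needed:.1f}")
```

Under these assumptions a single glycosylated core accounts for only about a third of the observed size, which is why aggregation of subunits is invoked above.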

An unexpected finding was that the antibodies HMFG-1 and HMFG-2, which react with the milk mucin, also show a
positive reaction with the extensively stripped material which showed no lectin binding capability. Previous indirect evi-
dence, including the resistance to fixation, boiling and reduction, the repetitive nature of their epitopes and the appear-
ance of several bands on immunoblots, had led to the belief that carbohydrate present on the milk mucin was involved
in these epitopes. This idea was reinforced by the observation that lectins could block the binding of HMFG-1 and 2 (1).
While it is not possible to exclude the possibility that some sugars, undetected by the lectin binding experiments, remain
on the extensively stripped mucin described here, this is unlikely to be the explanation for the reactivity of the antibodies
HMFG-1 and 2. This can be said since both antibodies have recently been shown to react positively with β-galactosi-
dase fusion proteins expressed by phage carrying DNA coding for the core protein of the mammary mucin. It appears
therefore that at least part of each of the epitopes recognised by HMFG-1 and HMFG-2 contains amino acids, but it must
be assumed that some of these epitopes on the core protein are exposed, i.e. not masked, in the fully glycosylated mol-
ecule. The HMFG-2 epitope is however less abundant on the milk mucin than the HMFG-1 epitope, while it is readily
detectable on the mucin molecules expressed by tumours (1). These molecules have a smaller molecular weight and
may be less heavily glycosylated or polymerized.

Here we have reported the development of new antibodies which are reactive with the protein core of the mucin
and with the partially deglycosylated molecule, but which are unreactive with the fully processed mucin produced by the
lactating mammary gland. One of these antibodies, SM-3, which is an IgG1, has been studied in more detail. It has been
shown to react with the mucin molecules which are produced by breast cancer cells and are recognised by many anti-
bodies developed against the intact milk mucin. It should be emphasized however that the epitope recognised by SM-
3, which is on the core protein and is exposed in the mucin as processed by tumour cells, is not exposed on the normally
processed milk mucin. This feature offers the possibility of enhanced tumour specificity, and a pilot immunohistochem-
ical study of breast tumours and tissues has shown that indeed the SM-3 antibody reacts strongly with the majority of
primary breast cancers (91%) but shows little or no reaction with benign breast tumours, resting or lactating breast and
most normal tissues.
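
The 91% figure follows directly from the case counts given in the Results (20 of 22 primary carcinomas). As a minimal sketch, the tallies reported there can be converted to percentages as follows; the variable names are ours and the counts are copied from the text.

```python
# Counts as reported in the Results: (cases with the outcome, cases examined).
sm3_positive_carcinomas = (20, 22)
hmfg2_positive_carcinomas = (22, 22)
sm3_negative_pregnant_or_lactating = (3, 6)
sm3_negative_resting_breast = (8, 13)


def percent(hits_and_total):
    """Return the proportion as a whole-number percentage."""
    hits, total = hits_and_total
    return round(100 * hits / total)


for label, counts in [
    ("SM-3 positive, primary carcinomas", sm3_positive_carcinomas),
    ("HMFG-2 positive, primary carcinomas", hmfg2_positive_carcinomas),
    ("SM-3 totally negative, pregnant/lactating breast", sm3_negative_pregnant_or_lactating),
    ("SM-3 totally negative, resting breast", sm3_negative_resting_breast),
]:
    print(f"{label}: {counts[0]}/{counts[1]} = {percent(counts)}%")
```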

There are several implications of the work described here which may be important for both basic and clinical stud- 
ies in breast cancer. The observation that parts of the core protein (detectable by antibodies) are exposed on the 
mucins as processed by breast cancer but masked on the mucin as processed by cells in normal breast and benign 
tumours implies that there is an alteration in the processing of the mucin in malignancy. A more detailed study of the 
processing of the mucin in normal and malignant cells may then give basic information for defining the malignant cell.
Moreover, since the specificity of the reaction of the antibody SM-3 for tumours is better than that of antibodies devel- 
oped against the intact mucin, this antibody may prove to be a more effective diagnostic tool for the detection of breast 
cancer cells in tissue sections, tissue fluids and cells. The reactive components are membrane associated as well as 
intracellular, and in vivo localisation of tumours may also be possible.


Abbreviations 

The abbreviations used are: HMFG, human milk fat globule; PBS, phosphate-buffered saline (153 mM NaCl, 3 mM
KCl, 10 mM Na2HPO4, 2 mM KH2PO4, pH 7.4); WGA, wheat germ agglutinin; PNA, peanut agglutinin; HPA, Helix
pomatia agglutinin; BSA, bovine serum albumin; SDS, sodium dodecyl sulfate.
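
For readers preparing the buffer, the PBS composition stated above (153 mM NaCl, 3 mM KCl, 10 mM Na2HPO4, 2 mM KH2PO4) can be converted to weights per litre. This is a sketch under the assumption that anhydrous salts are used; the text does not say which hydrates were weighed out, and hydrated forms would need different molar masses. The pH adjustment to 7.4 is not modelled.

```python
# Molar masses in g/mol for the anhydrous salts (an assumption, see above).
MOLAR_MASS = {"NaCl": 58.44, "KCl": 74.55, "Na2HPO4": 141.96, "KH2PO4": 136.09}

# Concentrations in mM as listed in the abbreviations.
PBS_MILLIMOLAR = {"NaCl": 153.0, "KCl": 3.0, "Na2HPO4": 10.0, "KH2PO4": 2.0}


def grams_needed(volume_litres: float) -> dict:
    """Grams of each salt required for the given volume of PBS."""
    return {
        salt: (millimolar / 1000.0) * MOLAR_MASS[salt] * volume_litres
        for salt, millimolar in PBS_MILLIMOLAR.items()
    }


recipe = grams_needed(1.0)
for salt, grams in recipe.items():
    print(f"{salt}: {grams:.2f} g per litre")
```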

Example 2 

Purification and deglycosylation of human milk mucin were conducted as in Example 1; the mucin was purified on an
HMFG-1 antibody column.

The stripped mucin preparations were separated by electrophoresis through NaDodSO4/polyacrylamide gels
(10%) and silver stained by two methods, one of which can be used to stain highly glycosylated proteins (22, 23).

Preparation of polyclonal rabbit antiserum to stripped core protein

One New Zealand White rabbit was immunized with 100 µg of the partially stripped core protein in complete
Freund's adjuvant (Gibco). Booster injections of 500 µg of the totally stripped core protein were administered in incomplete
Freund's adjuvant (Gibco) 3 and 4 weeks after the initial injection and the rabbit was bled one week later. Ten microliters
of immune serum (75 ng/ml protein) precipitated 200 ng of fully stripped core protein in a Protein A assay (24) and
detected it on immunoblots. The immunoglobulin fractions of rabbit preimmune and rabbit anti-mucin core protein were
prepared by adding ammonium sulfate to 50% saturation. The resulting pellet was resuspended in one-half the original
serum volume of PBS and dialyzed against the same buffer. After dialysis, any residual precipitate was removed by
centrifugation. Immunoglobulin fractions were stored in aliquots at -20°C.

Description of MAbs used

In addition to the polyclonal antiserum used for initial screening, a cocktail of two MAbs, SM-3 and SM-4 (see Example
1), which recognise the mucin core protein (20), and HMFG-1 and HMFG-2 (1, 14) were used to screen the purified
plaques, the β-galactosidase fusion proteins and for immunoprecipitations from in vitro translated proteins*. Other
MAbs used were a monoclonal anti-β-galactosidase antibody (25) which was a gift from H. Durbin (ICRF, London), an
anti-interferon antibody, ST254 (6), a keratin antibody (26) and M18 which recognizes a carbohydrate structure
on the milk mucin (27).

*The MAbs SM-3 and SM-4 (SM refers to stripped mucin) show strong reactivity with the partially and fully stripped core pro-
tein but no reactivity with the fully glycosylated mucin (20).



In Vitro translation of proteins 

RNA was isolated from the human breast cancer cell line MCF-7 using the guanidinium isothiocyanate method of
Chirgwin et al. (28) and poly(A)+ RNA was purified by chromatography using oligo(dT)-cellulose (New England Bio
Labs). The poly(A)+ RNA was translated in a reticulocyte lysate system (Amersham) in the presence of [35S]methio-
nine (1000 Ci/mmole; 1 Ci = 37 GBq, Amersham). Samples containing 5 x 10^4 acid insoluble cpm were precipitated in
a protein A assay (24) using MAbs SM-3, SM-4, HMFG-1, HMFG-2 and a control antibody to human interferon. The
antibody-selected proteins were then separated on a 10% NaDodSO4/polyacrylamide gel, impregnated with Amplify
(Amersham) and exposed to XAR-5 film (Kodak) at -70°C.



Antibody screening of λgt11 library and protein blotting

The λgt11 library used in this study was constructed from mRNA isolated from the human breast cancer cell line
MCF-7 and was generously provided by Philippe Walter and Pierre Chambon (Strasbourg, France). The poly(A)+ RNA
used for the preparation of the randomly primed library was prepared from mRNA that sedimented faster than 28S
rRNA and was enriched in estrogen receptor (29). The library was made essentially as described by Huynh et al. and
Young and Davis (30-32) and contained approximately 1 x 10^6 recombinants per µg of RNA. Between 85% and 95% of
the plaques contained inserts.

The phage library was plated onto bacterial strain Y1090 and grown for 3 hr at 42°C. After isopropyl β-D-thiogalac-
toside (IPTG) induction and 3 hr of growth at 37°C, filters were prepared from each plate and screened with anti-mucin
core protein antibody by the method of Young and Davis (32). The first antibody used in screening was the rabbit antise-
rum raised against the stripped core protein prepared as described above. Prior to use in screening, the antiserum was
diluted 1:200 in PBS containing 1% bovine serum albumin (PBS/BSA). Preabsorption with Y1090 bacterial lysate was
not found to be necessary. The nitrocellulose filters (Schleicher and Schuell) were blocked by incubation in PBS con-
taining 5% BSA for 1 hr at room temperature with gentle agitation. The filters were incubated at room temperature over-
night with a 1:200 dilution of antiserum in heat sealed plastic bags. The filters were washed 5 x 5 min in PBS/BSA, and
bound antibody was detected by using horseradish peroxidase-conjugated sheep anti-rabbit antiserum (Dako) diluted
1:500 with PBS/BSA and incubated for 2 hr with the filters. The filters were washed 5 x 5 min in PBS/BSA and 1 x 10
min in PBS before color detection using 4-chloro-1-naphthol (1). Immunoreactive bacteriophage were picked and puri-
fied through two additional rounds of screening. Subsequently, bacteriophage inserts were subcloned into the EcoRI
sites of pUC8 (33), producing the plasmid used most extensively, pMUC10. The plasmids were maintained in DH1 cells.

To examine the β-galactosidase-cDNA fusion proteins for immunoreactivity, cell lysates were derived. Lysogens
were prepared as described in Young and Davis (34). Cells were pelleted, suspended in Laemmli sample buffer (35)
and separated by electrophoresis through NaDodSO4/polyacrylamide gels (10%) and transferred onto nitrocellulose fil-
ters as described (1, 36). The filters were treated as above for antibody screening.



Northern Analysis 

RNA was isolated from tissue culture cells and frozen tissues by the guanidinium isothiocyanate method of Chirg-
win et al. (28). Total RNA (10 µg per lane) was denatured by heating at 55°C for 1 hr in deionized glyoxal and fraction-
ated by electrophoresis through a 1.3% glyoxal gel (38). The RNA was transferred to nitrocellulose (Schleicher and
Schuell), prehybridized and hybridized as described by Thomas (34). Filters were washed down to 0.1X SSC with 0.1%
SDS at 65°C and exposed to XAR-5 film (Kodak) at -70°C with intensifying screens.



Southern analysis 

High molecular weight genomic DNA was prepared from white blood cells and cell lines (39, 40). These genomic
DNAs (10 µg) were cleaved with restriction enzymes following the manufacturer's recommended conditions and frac-
tionated through 0.6% and 0.7% agarose gels. Cloned plasmid DNA was cleaved and fractionated on 1.3% agarose. The
gels were denatured, neutralized and transferred to nylon membranes (Biodyne) according to the manufacturer's
instructions. The EcoRI insert from pMUC10 was separated on a 1% low melting point agarose (Biorad) gel and
labelled with [α-32P]dCTP by the method of random priming (41) and hybridized to filters at 42°C. Filters were washed
down to 0.1X SSC with 0.1% SDS at 55°C and exposed to XAR-5 film (Kodak) at -70°C with intensifying screens.

Purification and deglycosylation of mucin glycoprotein

[...]fied molecule, amino acid [...] accounting for 58% of the amino acids. Periodic acid silver [...]

position with [...] only when the gel was treated with periodic

silver stain without prior treatment with periodic acid. O-linked sugars that are

The purified material was subjected to tre[...]

characteristic [...] protein (treated at room temperature for 3 [...]

[...] fully [...] protein [...]

showed no reactivity with any of these three lectins. [...] electrophoresis through NaDodSO4/polyacrylamide

[...]
fully stripped mucin consisted of two bands of about 68 kd and 72 kd (Fig. 8, lane 2).

Antibody reactive proteins produced by MCF-7 cells

The MCF-7 breast cancer [...]
(14) and was thus judged to be a suitable source of mRNA for a cDNA library [...]
library with the monoclonal antibodies, they were tested [...] translated in vitro. Proteins

[...]tion products produced from MCF-7 mRNA. Poly(A)+ RNA from MCF-7 was prepared and trans[...]
from the translation reaction were immunoprecipitated using the [...] proteins of [...] 68 kd and
SM-4 and displayed by

92 kd were immunoprecipitated by SM-3 (lane 2) and SM-4 [...] precipitated by an irrelevant mon-

1 and -2 are, at least in part, protein in nature. [...] as estimated by comparing the

protein during in vitro translation.

Screening of the cDNA library

The λgt11 cDNA library made from size selected MCF-7 mRNA [...]

To demonstrate that the reactivity of the phage clones with the a[...]

on the cDNA translation product, β-galactosidase fusi[...] the stripped

[...]arated by electrophoresis, transferred to nitrocellulose paper and probed with a va[...] SM-3 and SM-

mucin, including the polyclonal antiserum which was used [...] differentiation and
4. In addition, HMFG-1 and HMFG-2, the two monoclonal antibodies [...] specifically rec-
tumour-associated epithelial mucin (1, 14) were tested. All 7 [...] 6 of the 7
ognized by the polyclonal antiserum, the monoclonal cocktail, and HMFG-2. HMFG-1 antibody re[...]
fusion proteins and failed to recognize the protein from clone 9 which contains the smallest insert. In every case the
strongest signal was given by the HMFG-2 antibody and this reaction is shown in Figure 10. Monoclonal antibodies to
keratins and to a carbohydrate epitope on the fully glycosylated mucin were used as controls and showed no reactivity.
A monoclonal antibody to β-galactosidase was a positive control and the band recognized correlated in every case with
the band recognized by the specific antibodies. The sizes of the fusion proteins varied in proportion to the sizes of the
cDNA inserts found in the bacteriophage.
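
The proportionality between insert size and fusion protein size can be illustrated with a rough estimate. The sketch below is not taken from the source: it assumes an average residue mass of about 110 daltons, a β-galactosidase moiety of about 116 kd, and that the whole insert is translated in frame, none of which is stated in the text. The insert sizes of 200 and 1800 bp are the extremes reported for these clones.

```python
AVG_RESIDUE_DA = 110.0   # rule-of-thumb average mass per amino acid (assumption)
BETA_GAL_KD = 116.0      # approximate size of the beta-galactosidase moiety (assumption)


def insert_contribution_kd(insert_bp: int) -> float:
    """Mass added to the fusion protein if the whole insert is read in frame."""
    codons = insert_bp // 3
    return codons * AVG_RESIDUE_DA / 1000.0


def fusion_protein_kd(insert_bp: int) -> float:
    """Estimated total size of the beta-galactosidase fusion protein."""
    return BETA_GAL_KD + insert_contribution_kd(insert_bp)


for bp in (200, 1800):
    print(f"{bp} bp insert: +{insert_contribution_kd(bp):.1f} kd, "
          f"fusion about {fusion_protein_kd(bp):.0f} kd")
```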



Characterization of cDNAs and RNA blot analysis 



The inserts from the λ clones were designated pMUC3-10 (omitting pMUC5) and were subcloned into the vector
pUC8 for easier manipulation. The 7 clones were compared to each other for sequence homology. Each of the plas-
mids was digested with EcoRI and the insert separated on a 1.4% agarose gel. The largest cDNA insert from pMUC10
was used to probe the inserts and found to hybridize to all 6 inserts (Fig. 11). pMUC7 was found to contain two inserts
following digestion with EcoRI; however, only 1 of the inserts hybridized to the pMUC10 probe. The insert bands were
not derived from phage DNA since the pMUC10 probe did not hybridize to HindIII-digested λ phage DNA.

As shown by agarose gel electrophoresis (Fig. 11), the inserts vary in size from about 200 to up to about 1800 bp.
The largest insert from pMUC10 has been used as the hybridization probe in all subsequent experiments.

Because the λMUC clones were identified only by antibody binding, we needed additional assurance that they were
indeed coding for the breast epithelial mucin. To determine the authenticity of pMUC10, we correlated the presence of
mRNA hybridizing to the clone with mucin expression in various cell lines. As shown in figure 12, the cDNA hybridized
to two transcripts of 4.7 kb and 6.4 kb in the RNA from the breast cancer cell lines MCF-7 and T47D which were shown
previously to express the HMFG-2 antigen (1, 14). Significantly, the pMUC10 probe hybridized to transcripts of approx-
imately the same size in RNA extracted from normal mammary epithelial cells cultured from milk (42). A third band of
5.7 kb can be seen in the RNA from these normal cells. In contrast, three human cell types that lack the mucin, breast
fibroblasts, Daudi cells and HS578T, a carcinosarcoma line derived from breast tissue (43), showed no detectable
pMUC10-related mRNA. The 6.4 kb band appears to be the most abundantly expressed. The presence of at least two
sizes of mRNA from MCF-7 cells correlates with the immunoprecipitation of two proteins (molecular weights 68 kd
and 92 kd) from in vitro translated mRNA from MCF-7 cells. The normal mammary epithelial cells were derived from
pooled milk samples and the additional transcript observed may be due to polymorphisms among individuals.


Genomic DNA blot hybridization and detection of a restriction fragment length polymorphism (RFLP)

Genomic DNA was prepared from a panel of ten individuals consisting of six unrelated individuals and a family of
four, and from three cell lines. The DNAs which were digested with HinfI or EcoRI and blotted and hybridized to the radi-
olabelled pMUC10 insert, exhibit restriction fragment length polymorphisms. The restriction fragments from the ten indi-
viduals and three cell lines are shown in figure 13. The pattern consists of either a single band or a doublet of sizes
ranging from 3400bp to 6200bp in the HinfI digest (with the exception of the ZR75-1 DNA in lane 12, figure 13A which
shows three bands) or from 8200bp to 9600bp in the EcoRI digest (Figure 13B). There appears to be a continuous dis-
tribution of the fragment sizes which implies a high in vivo instability at the locus. The pattern of fragments observed in
the family of four (lanes 1-4) suggests that these fragments are allelic. Preliminary studies of the DNA made from white
blood cells of normal, related individuals indicate the existence of a number of independent alleles with an autosomal
codominant mode of inheritance. These studies will be the subject of a separate investigation.
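
Under autosomal codominant inheritance, each child's band pattern should contain one fragment from each parent, with a single band read as a homozygote. The check below is a minimal sketch of that rule. The fragment sizes are hypothetical values chosen inside the 3400 to 6200 bp HinfI range reported above; they are not the sizes observed in the family of four. Exact size matching is also an assumption, since real gels would need a size tolerance.

```python
from typing import Iterable, Tuple


def alleles(bands: Iterable[int]) -> Tuple[int, int]:
    """Expand an observed band pattern into two alleles (one band = homozygote)."""
    sizes = tuple(sorted(set(bands)))
    if len(sizes) == 1:
        return (sizes[0], sizes[0])
    if len(sizes) == 2:
        return (sizes[0], sizes[1])
    raise ValueError("expected one or two bands per individual")


def consistent_with_parents(child, mother, father) -> bool:
    """True if the child could have received one allele from each parent."""
    a, b = alleles(child)
    m, f = set(alleles(mother)), set(alleles(father))
    return (a in m and b in f) or (a in f and b in m)


# Hypothetical HinfI fragment sizes in bp, for illustration only.
mother = (3400, 4800)
father = (4100, 6200)
child_consistent = (3400, 6200)   # one maternal and one paternal fragment
child_inconsistent = (4800,)      # homozygote for a fragment the father lacks

print(consistent_with_parents(child_consistent, mother, father))
print(consistent_with_parents(child_inconsistent, mother, father))
```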

Discussion 

The cDNA clones described here which were obtained from the MCF-7 λgt11 library were selected using polyclo-
nal and monoclonal antibodies prepared against a normal cellular product, the milk mucin in its deglycosylated form.
This was done because it was easier to obtain large quantities of the mucin for stripping than to prepare similar quan-
tities of immunologically related glycoproteins expressed by breast cancer cells (44). The fact that the antibodies did
select for cDNA coding for nonglycosylated core protein molecules in MCF-7 cells strongly suggests that the glycopro-
teins in these cells, which were originally detected by their reaction with antibodies to the milk mucin, contain the same
core protein as this mucin. This is confirmed by the detection of mRNAs of approximately the same sizes in the normal
and malignant cells, using one of the probes isolated from the MCF-7 library. We will therefore refer to the antibody
reactive glycoproteins on breast cancer cells as mucins, bearing in mind that their processing may be different, resulting
in molecules of different molecular weights but with the same core protein as that of the milk mucin.

Seven clones were obtained from the MCF-7 library of which the largest was 1800 bp. This clone cross hybridized
with the other 6 smaller clones. The β-galactosidase fusion proteins expressed by six of the cross-hybridizing lambda
clones were reactive with the polyclonal antiserum directed against the mucin core protein as well as with four well-
characterized monoclonal antibodies directed to various epitopes on the stripped core protein, SM-3, SM-4, HMFG-1
and HMFG-2 (14, 20). The smallest lambda clone, λMUC9, produced a β-galactosidase fusion protein which reacted
with three of the four monoclonal antibodies and with the polyclonal antiserum.

The surprising result that the extensively characterized HMFG-1 and HMFG-2 monoclonal antibodies reacted
strongly with the lambda plaques and the fusion proteins and could immunoprecipitate proteins from in vitro translated
mRNA provides strong evidence that these clones do indeed code for a portion of the mucin core protein. Although pre-
vious evidence such as resistance to fixation, boiling, treatment with dithiothreitol and NaDodSO4 and the presence of
multiple epitopes on the molecule suggested that these were carbohydrate (1), it has now been established that the
epitopes of the HMFG-1 and HMFG-2 monoclonal antibodies are definitely protein in nature. Carbohydrate may be
required to obtain the strongest binding, either as part of the epitope or by conferring some conformational change on
the protein portion, but part of the antigenic determinant must consist of an amino acid sequence. Since these two
MAbs are reactive with the fully glycosylated milk mucin as well as the stripped core protein, this data means that the
intact molecule contains areas of naked peptide which contribute to the antigenic sites for these two antibodies.

Confirmatory evidence that pMUC10 codes for the mammary mucin core protein is provided by RNA blots. The rel-
ative abundance of mRNA in the breast cancer cell lines MCF-7, T47D, ZR-75-1 and in normal mammary epithelial cells
corresponds to the antigen expression by these cells as measured by the binding of the HMFG-1 and HMFG-2 mono-
clonal antibodies. Cell types which are negative for antigen expression such as human fibroblasts, Daudi cells and
HS578T, a carcinosarcoma line derived from breast (14), are negative in RNA blot hybridizations. A fortuitous obser-
vation made with the ZR-75-1 cells yielded indirect strong evidence that pMUC10 does indeed code for the mucin glyc-
oprotein core protein. This cell line, which routinely expresses large amounts both of mRNA and antigen, yielded one
preparation of RNA which was unexpectedly negative by blot hybridization. It was subsequently found that those par-
ticular ZR-75-1 cells from which the RNA had been made had lost the expression of the antigen as well at this time (as
determined by reaction with HMFG-1 and 2). Different passage numbers of the ZR-75-1 cells were recovered and
shown once again to express both antigen and message. The sizes of the messages, 4.7 kb and 6.4 kb, are quite large,
since a 68 kd or 92 kd protein would need only about 3 kb to code for the protein portions. This suggests that a large
portion of the mRNA may be untranslated. Efforts are underway to obtain a full-length clone.
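
The coding-capacity argument can be reproduced with a rule-of-thumb calculation. The sketch below assumes an average residue mass of about 110 daltons, which is our assumption and not a figure from the source; the protein sizes (68 kd and 92 kd) and message sizes (4.7 kb and 6.4 kb) are those given above.

```python
AVG_RESIDUE_DA = 110.0  # rule-of-thumb average mass per amino acid (assumption)


def coding_kb(protein_kd: float) -> float:
    """Kilobases of coding sequence needed for a protein of the given size."""
    residues = protein_kd * 1000.0 / AVG_RESIDUE_DA
    return residues * 3.0 / 1000.0


for protein_kd in (68.0, 92.0):
    needed = coding_kb(protein_kd)
    for message_kb in (4.7, 6.4):
        spare = message_kb - needed
        print(f"{protein_kd:.0f} kd protein needs {needed:.1f} kb; "
              f"a {message_kb} kb message leaves {spare:.1f} kb unaccounted for")
```

On this estimate both proteins fit within roughly 3 kb, which is consistent with the statement that a large part of each message may be untranslated.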

Thus, the cDNA clones presented here represent a portion of the gene coding for the human mammary mucin
which is expressed by differentiated breast tissue as well as by most breast cancers. The major proteins precipitated
from in vitro translation products of RNA from MCF-7 cells by antibodies to the milk mucin core protein (68Kd) have an
apparent molecular weight of 68Kd and 92Kd. These proteins, produced by the breast cancer cell, therefore share
epitopes with the 68Kd core protein of the milk mucin (20). Whether a similar 92Kd protein is also produced by normal
mammary epithelial cells, and is truncated or destroyed by HF treatment, is not yet clear. MCF-7 cells biosynthetically
labelled with 14C amino acids yield, upon immunoprecipitation with HMFG-1 and HMFG-2 antibodies, two glycosylated
proteins of 320 kd and 430 kd and it is possible that each of these glycoproteins utilizes only one core protein of either
68Kd or 92Kd. Alternatively, each of the glycoproteins could contain both the 92Kd and 68Kd proteins either in different
proportions or variably glycosylated. Further screening of the library may yield full length cDNAs coding for both sizes
of the immunologically related core proteins. Since there appears to be only a single gene (based on Southern blot data
obtained by using a partial cDNA probe), it is probable that the multiple messages arise by alternative RNA splicing and
this would explain the fact that they contain common sequences. Although a core protein of 68 kd appears to be small
to yield a fully glycosylated molecule of greater than 300 kd which contains 50% carbohydrate, there is evidence that
such a structure for mucins is possible. Ovine submaxillary mucin has a reported molecular weight of 1 x 10^6 daltons
(45), yet its protein core consists of 650 amino acids resulting in a molecule of 58 kd (46).

The mucins which are detected with HMFG-1 and HMFG-2 MAbs on immunoblots of tumours and breast cancer
cell lines show variations in size from 80 kd to 400 kd in the molecular weights of the tumour mucin molecules (1, 47).
Using these same antibodies which detect high molecular weight mucins present in normal urine, a polymorphism has
indeed been shown to be genetically determined (48). Although the very low molecular weight components are likely to
represent precursor forms of the mucin which appears to be incompletely processed in many tumour cells (20), the var-
iations in the higher molecular weight components are likely to be due to this genetic polymorphism. It was unclear,
however, whether the structural basis of the polymorphism was due to the genetically determined protein or to the car-
bohydrate portion of the mucin. The detection of restriction fragment length polymorphisms in the Southern blotting
experiments using the mucin probe suggests that the mucin polymorphism occurs at the level of the DNA which codes
for the protein. Preliminary sequence data suggest that the basis for this polymorphism is a region of variable tandem
repeats present in the protein coding sequences. This structural feature may be responsible for the generation of the
many allelic restriction fragments at the mucin locus. We are presently investigating the basis of the mucin polymor-
phism by a Southern blot survey of DNA from white blood cells of normal, related individuals whose inheritance pattern
of urinary mucins has been determined. In addition, we are examining DNA preparations made from the white blood
cells and tumours of individual breast cancer patients to determine if there is any discordance between genotype in the
paired samples, since tandemly repeated DNA may provide an unstable site where recombination or amplification could

The presence of mucins in the majority of carcinomas and their association with the differentiation of mammary epi-
thelial cells makes it particularly important to identify regions involved in the tissue specific and developmental regula-
tion of the gene. Moreover, the introduction of a functional mucin gene into cells should provide insights into the role of
this molecule in breast epithelial differentiation and possibly enable us to identify any alterations in the function or
expression of the mucin which are related to malignant transformation in the human breast.
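
The tandem-repeat structure proposed as the basis of the polymorphism can be seen directly in the 40-residue sequence (I) recited in the claims. The sketch below finds the smallest repeating unit of that sequence and converts it to a nucleotide length. The function name is ours, and the conversion to base pairs simply assumes three nucleotides per residue.

```python
def smallest_period(units):
    """Length of the shortest unit that, repeated, reproduces the sequence."""
    n = len(units)
    for period in range(1, n):
        if all(units[i] == units[i + period] for i in range(n - period)):
            return period
    return n


# Sequence (I) as recited in the claims, three-letter code.
SEQUENCE_I = (
    "Val Thr Ser Ala Pro Asp Thr Arg Pro Ala Pro Gly Ser Thr Ala "
    "Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr Arg Pro Ala "
    "Pro Gly Ser Thr Ala Pro Pro Ala His Gly"
).split()

period = smallest_period(SEQUENCE_I)
repeat_bp = period * 3  # three nucleotides per amino acid

print(f"{len(SEQUENCE_I)} residues, repeat unit of {period} residues "
      f"({repeat_bp} bp per repeat)")
```

Alleles differing in the number of such repeats would differ in restriction fragment length by multiples of the repeat unit, which fits the range of fragment sizes described above.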



Abbreviations 

The abbreviations are as follows: PBS, phosphate-buffered saline; MAb, monoclonal antibody; IPTG, isopropyl β-
D-thiogalactoside; bp, base pair(s); kb, kilobase(s).



TABLE 1

Amino acid composition of the human milk mucin - comparison with PAS-O

Amino acid    HMFG-1 purified    Extensively stripped    PAS-O (Shimizu &
              milk mucin         milk mucin              Yamauchi 1982)

Asp           6.1                7.2                     6.4
Thr           9.4                9.7                     [...]
Ser           [...]              13.0                    13.1
Glx           6.3                9.6                     8.3
Pro           14.8               14.4                    12.0
Gly           8.1                10.1                    12.2
Ala           12.3               11.9                    13.0
Cys           Not analysed       Not analysed            0.5
Val           6.0                6.3                     5.3
Met           0.5                0.4                     0.8
Ile           1.6                1.7                     1.9
Leu           4.5                4.8                     3.7
Tyr           2.0                0.9                     1.6
Phe           2.0                1.6                     1.7
His           3.2                2.3                     3.8
Lys           2.8                3.3                     2.2
Arg           4.0                4.0                     3.9

Table 2

Reactivity of the antibodies on intact, partially and totally deglycosylated milk mucin

Antibody      Intact molecule    Partially stripped mucin    Totally stripped mucin

5.17          8,524              11,925                      5,780
9.13          525                3,000                       3,328
SM-3          465                15,414                      9,200
SM-4          816                16,750                      9,561
HMFG-1        32,000             33,768                      9,494
HMFG-2        29,500             29,230                      15,832
NS2 medium    397                845                         650

The binding of the antibodies to iodinated intact, partially and totally deglycosylated milk mucin was assayed using
the protein A plate method as described in Materials and Methods.
Claims

1. A polypeptide comprising 5 or more amino acid residues in a sequence corresponding to the sequence (I)

Val Thr Ser Ala Pro Asp Thr Arg Pro Ala Pro Gly Ser Thr Ala
Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr Arg Pro Ala
Pro Gly Ser Thr Ala Pro Pro Ala His Gly          (I)
2. A polypeptide according to claim 1 having 20 or more amino acid residues in a sequence corresponding to the
sequence (I)

Val Thr Ser Ala Pro Asp Thr Arg Pro Ala Pro Gly Ser Thr Ala
Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr Arg Pro Ala
Pro Gly Ser Thr Ala Pro Pro Ala His Gly          (I)



3. A polypeptide according to claim 1 or claim 2 wherein at least one amino acid residue bears a linkage sugar
substituent.

4. A polypeptide according to claim 3 wherein the linkage sugar bears an oligosaccharide moiety.

5. A polypeptide according to claim 3 or claim 4 wherein the or each amino acid bearing a substituent is a serine or
threonine and the linkage sugar is N-acetyl galactosamine.

6. A polypeptide according to any one of claims 3 to 5 linked to a carrier protein.

7. An antibody or fragment thereof against a polypeptide according to any one of claims 1 to 6 which antibody or
fragment has reduced or substantially no reaction with fully processed human epithelial mucin glycoprotein.

8. A monoclonal antibody or fragment thereof according to claim 7. 

9. A hybridoma cell capable of secreting a monoclonal antibody according to claim 8. 

10. A hybridoma cell of the cell line designated HSM3 (ECACC 87010701). 

11. A monoclonal antibody secreted by HSM3 (ECACC 87010701).

12. A nucleic acid fragment comprising at least 17 nucleotide bases, the fragment being hybridisable with at least one of

a) the DNA sequence

5' ACC GTG GGC TGG GGG GGC GGT GGA GCC CGG-
   GGC CGG CCT GGT GTC CGG GGC CGA GGT GAC-
   ACC GTG GGC TGG GGG GGC GGT GGA GCC CGG-
   GGC CGG CCT GGT GTC CGG GGC CGA GGT GAC 3'

b) DNA of sequence

5' GTC ACC TCG GCC CCG GAC ACC AGG CCG GCC-
   CCG GGC TCC ACC GCC CCC CCA GCC CAC GGT-
   GTC ACC TCG GCC CCG GAC ACC AGG CCG GCC-
   CCG GGC TCC ACC GCC CCC CCA GCC CAC GGT 3'

c) RNA having a sequence corresponding to the DNA sequence of a) and

d) RNA having a sequence corresponding to the DNA sequence of b).

13. A nucleic acid fragment according to claim 12 comprising a portion of at least 30 nucleotide bases capable of 
hybridising with at least one of sequences (a) to (d). 

14. A DNA fragment according to claim 12 or 13. 

15. A double stranded DNA fragment comprising antiparallel paired portions having respectively sequences (a) and (b)
as defined in claim 12.

16. An antibody or fragment thereof according to any one of claims 7, 8 and 11 bearing a detectable label or a
therapeutically or diagnostically effective moiety.

17. An antibody or fragment thereof according to any one of claims 7, 8, 11 and 16 for use in a method of therapy or
diagnosis practised on the human or animal body.

18. Human polymorphic epithelial mucin core protein bearing a detectable label or a therapeutically or diagnostically 
effective moiety. 

19. Human polymorphic epithelial mucin core protein according to claim 18 for use in a method of therapy or diagnosis
practised on the human or animal body.

20. A polypeptide according to any one of claims 1 to 6 bearing a detectable label or a therapeutically or diagnostically 
effective moiety. 

21. A polypeptide according to any one of claims 1 to 6 and 20 for use in a method of therapy or diagnosis practised
on the human or animal body.

22. A nucleic acid fragment according to any one of claims 12 to 15 bearing a detectable label or a therapeutically or
diagnostically effective moiety.

23. A nucleic acid fragment according to any one of claims 12 to 15 and 22 for use in a method of therapy or diagnosis
practised on the human or animal body.

24. An assay method comprising contacting a sample suspected to contain abnormal human mucin glycoproteins with
an antibody or fragment thereof according to any one of claims 7, 11 and 16.

25. A diagnostic or therapeutic method practised on the human or animal body comprising administering an antibody
or fragment thereof according to any one of claims 7, 11, 16 and 17.

EP 0 823 438 A2



26. A diagnostic or therapeutic method practised on the human or animal body comprising administering human
polymorphic epithelial mucin core protein according to claim 18 or claim 19.

27. A diagnostic or therapeutic method practised on the human or animal body comprising administering a polypeptide
according to any one of claims 1 to 6, 20 and 21.

28. A diagnostic or therapeutic method practised on the human or animal body comprising administering a nucleic acid
fragment according to any one of claims 12 to 15, 22 and 23.
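
By way of illustration only, and forming no part of the claims, the relationship between sequences (a) and (b) of claim 12 and the repeat sequence (I) of claim 1 can be checked with a short Python sketch. The sketch assumes the standard genetic code and assumes that sequence (b) is read in frame from its first base; the helper names clean, reverse_complement and translate are hypothetical and are chosen here for convenience only.

```python
# Illustrative sketch only. Assumptions: standard genetic code, and
# sequence (b) read in frame from its first base. Helper names are hypothetical.

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter residue names, as used in sequence (I).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def clean(seq):
    """Remove whitespace and upper-case a nucleotide sequence."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return clean(seq).translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate a DNA sequence, codon by codon, into three-letter residues."""
    s = clean(seq)
    usable = len(s) - len(s) % 3  # ignore any trailing partial codon
    return [THREE_LETTER[CODON_TO_AA[s[i:i + 3]]] for i in range(0, usable, 3)]


# The 60-base units transcribed from claim 12; each sequence is two units.
REPEAT_A = ("ACC GTG GGC TGG GGG GGC GGT GGA GCC CGG "
            "GGC CGG CCT GGT GTC CGG GGC CGA GGT GAC")
REPEAT_B = ("GTC ACC TCG GCC CCG GAC ACC AGG CCG GCC "
            "CCG GGC TCC ACC GCC CCC CCA GCC CAC GGT")
SEQ_A = clean(REPEAT_A) * 2
SEQ_B = clean(REPEAT_B) * 2

# The 20-residue unit transcribed from claim 1; sequence (I) is two units.
REPEAT_I = ("Val Thr Ser Ala Pro Asp Thr Arg Pro Ala "
            "Pro Gly Ser Thr Ala Pro Pro Ala His Gly")
SEQUENCE_I = REPEAT_I.split() * 2

if __name__ == "__main__":
    print("(b) encodes sequence (I):", translate(SEQ_B) == SEQUENCE_I)
    print("(a) is the reverse complement of (b):",
          reverse_complement(SEQ_B) == SEQ_A)
```

Under these assumptions, sequence (b) translates to the forty residues of sequence (I), and sequence (a) is the reverse complement of sequence (b), which is consistent with the antiparallel pairing recited in claim 15.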